feat(topic-tree): hierarchical concept topic tree — experimental POC (flag-gated) #149
Draft
KylinMountain wants to merge 14 commits into
Convert assets/openkb-architecture.png (1.78MB) to WebP at q85 (~195KB, -89%) and update the README reference. Visually lossless for the diagram's text and line work.
…c-tree enabler) Claude-Session: https://claude.ai/code/session_018WiFnTo1YW9mtw47Fzir9K
… navigable); deterministic bootstrap order Claude-Session: https://claude.ai/code/session_018WiFnTo1YW9mtw47Fzir9K
…concepts Real e2e (reindex over 7 cross-domain docs) surfaced 126 broken links: concept bodies use the compiler's [[concepts/<stem>]] form, which stopped resolving once reindex nested the files. Emit a path-independent concepts/<stem> target too. Claude-Session: https://claude.ai/code/session_018WiFnTo1YW9mtw47Fzir9K
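A minimal sketch of the path-independent resolution this commit describes, not the actual `openkb/lint.py` code. The function names `build_link_index` and `broken_links` and the wikilink regex are hypothetical, chosen for illustration; the only behavior taken from the commit message is that a page is reachable by its bare stem and by `concepts/<stem>` no matter how deeply it is nested.

```python
import re
from pathlib import Path

# Hypothetical pattern: capture the target of [[target]], [[target|alias]], [[target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def build_link_index(wiki_root: Path) -> set[str]:
    """Collect every target string a wikilink may use for pages under wiki_root."""
    targets: set[str] = set()
    for page in wiki_root.rglob("*.md"):  # recursive, so nested topic dirs are covered
        rel = page.relative_to(wiki_root).with_suffix("")
        targets.add(rel.as_posix())  # full path, e.g. concepts/ml/attention
        targets.add(page.stem)       # bare stem, e.g. attention
        if rel.parts[0] == "concepts":
            # path-independent form: still resolves after the file is nested
            targets.add(f"concepts/{page.stem}")
    return targets


def broken_links(text: str, targets: set[str]) -> list[str]:
    """Return the wikilink targets in text that resolve to no known page."""
    return [t.strip() for t in WIKILINK.findall(text) if t.strip() not in targets]
```

With a page stored at `concepts/ml/attention.md`, both `[[attention]]` and `[[concepts/attention]]` resolve under this sketch, which is the property that keeps existing concept bodies valid after a reindex moves files into topic directories.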
Cluster the full concept set top-down and recurse into oversized groups, instead of greedy one-by-one placement that froze the high-level taxonomy on the first ~FANOUT_K concepts. Drop choose from bootstrap; place_concept/split_node remain for the future live-incremental path. Per spec section 12. Claude-Session: https://claude.ai/code/session_018WiFnTo1YW9mtw47Fzir9K
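The top-down strategy in this commit message can be sketched roughly as below. This is an illustration under stated assumptions, not the code in `openkb/topic_tree.py`: the signature of `bootstrap`, the `Cluster` callable type, the `fanout_k` parameter name, and the nested dict/list return shape are all hypothetical. The clustering decision is passed in as a callable, mirroring the PR's statement that LLM decisions are injected.

```python
from typing import Callable, Union

# Hypothetical shapes: a topic node is a dict of named children, a leaf group is a list.
Tree = Union[dict, list]
Cluster = Callable[[list[str]], dict[str, list[str]]]


def bootstrap(concepts: list[str], cluster: Cluster, fanout_k: int) -> Tree:
    """Cluster the whole set first, then recurse only into oversized groups."""
    if len(concepts) <= fanout_k:
        return sorted(concepts)  # small enough to be a leaf group
    groups = cluster(concepts)
    # Guard: if clustering does not split the set, stop instead of recursing forever.
    if len(groups) < 2 or any(len(g) == len(concepts) for g in groups.values()):
        return sorted(concepts)
    # Sorted iteration keeps the resulting tree order deterministic.
    return {
        topic: bootstrap(members, cluster, fanout_k)
        for topic, members in sorted(groups.items())
    }


def by_prefix(names: list[str]) -> dict[str, list[str]]:
    """Stand-in for the LLM-backed cluster call: group by the text before the first '-'."""
    groups: dict[str, list[str]] = {}
    for name in names:
        groups.setdefault(name.split("-")[0], []).append(name)
    return groups
```

Because every concept is visible to the first clustering call, the top-level topics reflect the whole set rather than whichever concepts happened to be placed first, which is the problem with greedy placement that the commit message names.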
What this does
Turns the flat wiki/concepts/ into a topic tree that compile/query can descend instead of enumerating everything: internal nodes are topic pages (_topic.md with an LLM-written summary), leaves are concept pages. Same vectorless, reasoning-based idea as PageIndex/ConDB, applied to OpenKB's own concept graph.

How to try it
What's included
- openkb/topic_tree.py: generic engine with read_topic / place_concept / split_node / top-down bootstrap (global cold-start seed) + place_topic_dir. Pure, unit-tested; LLM decisions are injected callables.
- openkb/topic_tree_llm.py: LLM-backed choose / cluster / summarize.
- openkb/lint.py: wikilinks resolve by bare stem and concepts/<stem> recursively, so links survive a concept being moved into a topic dir (Obsidian-style).
- openkb/agent/{query,tools}.py: read_topic tool + tree-descent instructions (flag-gated).
- openkb/cli.py: openkb reindex (flag-gated bootstrap).
- openkb/agent/compiler.py: _write_concept(topic_dir=…) seam.

Status
- read_topic tool, and a full-stack e2e.
- reindex → 7 clean domain branches, 0 broken links, coherent topic summaries.

Deferred (follow-ups)
- _compile_concepts (so new docs compile sub-linearly without a reindex). Bigger: it cascades into _add_related_link / _backlink_* (flat-path assumptions). For now: compile flat → reindex.
- bootstrap clusters all briefs in one call (fine ≤ a few hundred; needs sampled/hierarchical seeding for 1M+); incremental + local merge/rebalance; bottom-up summary refresh; ConDB as the scale-tier retrieval backend (feat: OpenKB MVP - Karpathy's LLM Knowledge Base, powered by PageIndex #4).

Design notes for these are kept local (repo rule: specs out of git).

Note
This branch also carries one unrelated local commit, cdac90d docs: compress architecture diagram to WebP (yours, never pushed). I'll lift it onto main separately so the PR is just the topic-tree commits before this goes out of draft.

https://claude.ai/code/session_018WiFnTo1YW9mtw47Fzir9K
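For readers new to the idea under "What this does", here is a rough sketch of descending a topic tree instead of enumerating every concept. It is an assumption-laden illustration, not the `read_topic` tool or the agent instructions in this PR: the `descend` function, the `Choose` callable type, and the in-memory dict/list tree are hypothetical stand-ins for topic directories and an LLM-backed choice.

```python
from typing import Callable, Union

# Hypothetical in-memory tree: dict = topic node with named children, list = leaf concepts.
Tree = Union[dict, list]
Choose = Callable[[str, list[str]], str]


def descend(tree: Tree, question: str, choose: Choose) -> list[str]:
    """Walk from the root to one leaf group, asking `choose` once per topic node."""
    node = tree
    while isinstance(node, dict):
        options = sorted(node)  # only this node's child topic names are inspected
        node = node[choose(question, options)]
    return node


def keyword_choose(question: str, options: list[str]) -> str:
    """Stand-in for the LLM decision: pick the first option named in the question."""
    for option in options:
        if option in question:
            return option
    return options[0]
```

Each step looks only at one node's children, so the number of choices grows with tree depth rather than with the total number of concepts.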